A comparison of the discrete Kolmogorov-Smirnov statistic and the Euclidean distance
Abstract
Goodness-of-fit tests gauge whether a given set of observations is consistent (up to expected random fluctuations) with arising as independent and identically distributed (i.i.d.) draws from a user-specified probability distribution known as the “model.” The standard gauges involve the discrepancy between the model and the empirical distribution of the observed draws. Some measures of discrepancy are cumulative; others are not. The most popular cumulative measure is the Kolmogorov-Smirnov statistic; when all probability distributions under consideration are discrete, a natural noncumulative measure is the Euclidean distance between the model and the empirical distributions. In the present paper, both mathematical analysis and its illustration via various data sets indicate that the Kolmogorov-Smirnov statistic tends to be more powerful than the Euclidean distance when there is a natural ordering for the values that the draws can take — that is, when the data is ordinal — whereas the Euclidean distance is more reliable and more easily understood than the Kolmogorov-Smirnov statistic when there is no natural ordering (or partial order) — that is, when the data is nominal.
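The two statistics contrasted in the abstract can be sketched concretely for discrete distributions. Below is a minimal illustration (not the paper's code): it assumes the model's support is labeled 0, 1, ..., n_values-1, and the function names are illustrative. The Kolmogorov-Smirnov statistic is the maximum absolute difference between the model's and the empirical cumulative distributions, while the Euclidean distance compares the probability mass functions directly.

```python
import numpy as np

def empirical_pmf(draws, n_values):
    """Empirical probability mass function of i.i.d. draws over {0, ..., n_values-1}."""
    counts = np.bincount(draws, minlength=n_values)
    return counts / counts.sum()

def ks_statistic(model_probs, draws):
    """Discrete Kolmogorov-Smirnov statistic: max gap between cumulative distributions."""
    emp = empirical_pmf(draws, len(model_probs))
    return np.max(np.abs(np.cumsum(emp) - np.cumsum(model_probs)))

def euclidean_distance(model_probs, draws):
    """Euclidean distance between the empirical and model probability mass functions."""
    emp = empirical_pmf(draws, len(model_probs))
    return np.sqrt(np.sum((emp - model_probs) ** 2))

# Example: a uniform model over four values, with all draws landing on value 0.
model = np.array([0.25, 0.25, 0.25, 0.25])
draws = np.array([0, 0, 0, 0])
print(ks_statistic(model, draws))        # 0.75
print(euclidean_distance(model, draws))  # sqrt(0.75) ~ 0.866
```

Note that the KS statistic depends on how the values are ordered (permuting the categories changes the cumulative sums), whereas the Euclidean distance is invariant under relabeling; this is the mechanical reason behind the abstract's distinction between ordinal and nominal data.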